LiveSketch: Query Perturbations for Guided Sketch-based Visual Search
LiveSketch is a novel algorithm for searching large image collections using
hand-sketched queries. LiveSketch tackles the inherent ambiguity of sketch
search by creating visual suggestions that augment the query as it is drawn,
making query specification an iterative rather than one-shot process that helps
disambiguate users' search intent. Our technical contributions are: a triplet
convnet architecture that incorporates an RNN based variational autoencoder to
search for images using vector (stroke-based) queries; real-time clustering to
identify likely search intents (and so, targets within the search embedding);
and the use of backpropagation from those targets to perturb the input stroke
sequence, thereby suggesting alterations to the query that guide the search.
We show improvements in accuracy and time-to-task over contemporary baselines
using a 67M image corpus.
Comment: Accepted to CVPR 201
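The core perturbation step can be sketched in heavily simplified form: treat the search embedding as differentiable with respect to the query, and take gradient steps on the query itself so that its embedding moves toward an inferred target. In this illustrative stand-in, a fixed linear map `W` replaces the paper's triplet convnet and RNN variational autoencoder; all names and numbers are assumptions, not the published implementation.

```python
# Hypothetical sketch of query perturbation via backpropagation to the input:
# nudge a query vector so that its embedding W @ q approaches a target
# embedding. W is a toy stand-in for a learned encoder.
import numpy as np

def perturb_query(query, target_embedding, W, lr=0.1, steps=50):
    """Gradient descent on the *input* so W @ query approaches the target."""
    q = query.astype(float).copy()
    for _ in range(steps):
        residual = W @ q - target_embedding  # error in the search embedding
        grad = W.T @ residual                # d/dq of 0.5 * ||W q - t||^2
        q -= lr * grad                       # step on the query, not the weights
    return q

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)) * 0.3        # toy "encoder": 8-D query -> 4-D embedding
query = rng.normal(size=8)               # the user's current (vectorised) sketch
target = rng.normal(size=4)              # an inferred search-intent target

perturbed = perturb_query(query, target, W)
before = np.linalg.norm(W @ query - target)
after = np.linalg.norm(W @ perturbed - target)
assert after < before  # the perturbed query embeds closer to the intended target
```

The perturbed query, mapped back to stroke space, is what would be shown to the user as a suggested alteration; here it stays a plain vector.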
Collaborative Feature Learning from Social Media
Image feature representation plays an essential role in image recognition and
related tasks. The current state-of-the-art feature learning paradigm is
supervised learning from labeled data. However, this paradigm requires
large-scale category labels, which limits its applicability to domains where
labels are hard to obtain. In this paper, we propose a new data-driven feature
learning paradigm which does not rely on category labels. Instead, we learn
from user behavior data collected on social media. Concretely, we use the image
relationship discovered in the latent space from the user behavior data to
guide the image feature learning. We collect a large-scale image and user
behavior dataset from Behance.net. The dataset consists of 1.9 million images
and over 300 million view records from 1.9 million users. We validate our
feature learning paradigm on this dataset and find that the learned feature
significantly outperforms the state-of-the-art image features in learning
better image similarities. We also show that the learned feature performs
competitively on various recognition benchmarks.
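The behaviour-driven supervision can be illustrated with a toy stand-in: factor a user-by-image view matrix and compare images in the resulting latent space, so that images co-viewed by the same users come out similar. The tiny matrix and the rank-2 SVD below are illustrative assumptions, not the paper's actual latent-space construction over the Behance.net data.

```python
# Minimal illustration of mining image relationships from view behaviour
# instead of category labels: a truncated SVD of a user-by-image view
# matrix yields latent item vectors whose similarity reflects co-viewing.
import numpy as np

views = np.array([          # rows: users, cols: images (1 = viewed)
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
], dtype=float)

U, s, Vt = np.linalg.svd(views, full_matrices=False)
item_vecs = Vt[:2].T * s[:2]     # rank-2 latent representation per image

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# images 0 and 1 share viewers, so they should be closer in the latent
# space than images 0 and 2, which share none
assert cosine(item_vecs[0], item_vecs[1]) > cosine(item_vecs[0], item_vecs[2])
```

Similarities of this kind are what would then guide the image feature learning, standing in for explicit category labels.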
Building a Large Scale Dataset for Image Emotion Recognition: The Fine Print and The Benchmark
Psychological research results have confirmed that people can have different
emotional reactions to different visual stimuli. Several papers have been
published on the problem of visual emotion analysis. In particular, attempts
have been made to analyze and predict people's emotional reaction towards
images. To this end, different kinds of hand-tuned features are proposed. The
results reported on several carefully selected and labeled small image data
sets have confirmed the promise of such features. While the recent successes of
many computer vision related tasks are due to the adoption of Convolutional
Neural Networks (CNNs), visual emotion analysis has not achieved the same level
of success. This may be primarily due to the unavailability of confidently
labeled and relatively large image data sets for visual emotion analysis. In
this work, we introduce a new data set, which started from 3+ million weakly
labeled images of different emotions and ended up being 30 times as large as the
current largest publicly available visual emotion data set. We hope that this
data set encourages further research on visual emotion analysis. We also
perform extensive benchmarking analyses on this large data set using the state
of the art methods, including CNNs.
Comment: 7 pages, 7 figures, AAAI 201
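One way to go from millions of weakly labeled images to confidently labeled ones is an agreement filter: keep an image only when enough independent annotators confirm its weak emotion tag. The five-vote setup and the threshold below are illustrative assumptions, not the paper's exact verification protocol.

```python
# Hypothetical label-cleaning step for a weakly labeled emotion dataset:
# retain an image only if a minimum number of annotators agree with its tag.
def keep_image(votes, label, min_agree=3):
    """votes: emotion labels from independent annotators for one image."""
    return votes.count(label) >= min_agree

candidates = [
    ("img1.jpg", "amusement", ["amusement", "amusement", "joy", "amusement", "awe"]),
    ("img2.jpg", "fear", ["sadness", "fear", "anger", "sadness", "fear"]),
]
cleaned = [name for name, label, votes in candidates if keep_image(votes, label)]
# only img1.jpg survives: 3 of 5 annotators agreed with "amusement"
```

Raising `min_agree` trades dataset size for label confidence, which is exactly the tension a "fine print" analysis of such a benchmark has to navigate.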
Robust Image Sentiment Analysis Using Progressively Trained and Domain Transferred Deep Networks
Sentiment analysis of online user generated content is important for many
social media analytics tasks. Researchers have largely relied on textual
sentiment analysis to develop systems to predict political elections, measure
economic indicators, and so on. Recently, however, social media users have
increasingly been using images and videos to express their opinions and share
their experiences.
Sentiment analysis of such large scale visual content can help better extract
user sentiments toward events or topics, such as those in image tweets, so that
prediction of sentiment from visual content is complementary to textual
sentiment analysis. Motivated by the need to leverage large-scale yet noisy
training data to tackle the extremely challenging problem of image sentiment
analysis, we employ Convolutional Neural Networks (CNNs). We first design a
suitable CNN architecture for image sentiment analysis. We obtain half a
million training samples by using a baseline sentiment algorithm to label
Flickr images. To make use of such noisy machine labeled data, we employ a
progressive strategy to fine-tune the deep network. Furthermore, we improve the
performance on Twitter images by inducing domain transfer with a small number
of manually labeled Twitter images. We have conducted extensive experiments on
manually labeled Twitter images. The results show that the proposed CNN can
achieve better performance in image sentiment analysis than competing
algorithms.
Comment: 9 pages, 5 figures, AAAI 201
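The progressive idea can be sketched as a sample-selection loop: after training on machine-labeled data, keep only the samples the current model predicts both confidently and in agreement with their noisy label, then fine-tune on that cleaner subset. The margin threshold and the two-class setup below are illustrative assumptions, not the paper's exact criterion.

```python
# Hypothetical selection step for progressive fine-tuning on noisy labels:
# keep a sample only if the model's prediction agrees with the noisy label
# and is confident by a minimum probability margin.
def select_for_finetuning(samples, margin=0.3):
    """samples: (noisy_label, probs) pairs; probs over {negative, positive}."""
    kept = []
    for noisy_label, probs in samples:
        pred = max(range(len(probs)), key=probs.__getitem__)
        confident = max(probs) - min(probs) >= margin
        if pred == noisy_label and confident:
            kept.append((noisy_label, probs))
    return kept

batch = [
    (1, [0.1, 0.9]),    # agrees with noisy label, confident -> keep
    (1, [0.45, 0.55]),  # agrees but margin too small -> drop
    (0, [0.2, 0.8]),    # confident but disagrees -> drop
]
subset = select_for_finetuning(batch)
# only the first sample is retained for the next fine-tuning round
```

Iterating this loop progressively concentrates training on samples the network and the weak labeler agree about, reducing the influence of label noise.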
Speeding up Context-based Sentence Representation Learning with Non-autoregressive Convolutional Decoding
Context plays an important role in human language understanding, thus it may
also be useful for machines learning vector representations of language. In
this paper, we explore an asymmetric encoder-decoder structure for unsupervised
context-based sentence representation learning. We carefully designed
experiments to show that neither an autoregressive decoder nor an RNN decoder
is required. After that, we designed a model which still keeps an RNN as the
encoder, while using a non-autoregressive convolutional decoder. We further
combine a suite of effective designs to significantly improve model efficiency
while also achieving better performance. Our model is trained on two different
large unlabelled corpora, and in both cases the transferability is evaluated on
a set of downstream NLP tasks. We empirically show that our model is simple and
fast while producing rich sentence representations that excel in downstream
tasks.
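The asymmetry can be illustrated with toy decoders: an autoregressive decoder must produce target positions one at a time, each step conditioned on the previous output, while a non-autoregressive decoder maps the encoder state to every position in a single parallel pass. The tiny linear "decoders" below are stand-ins for the paper's RNN and convolutional modules; all shapes and names are illustrative.

```python
# Illustrative contrast between autoregressive and non-autoregressive
# decoding from a fixed sentence representation z.
import numpy as np

rng = np.random.default_rng(1)
d, T, V = 6, 4, 10                   # hidden size, target length, vocab size
z = rng.normal(size=d)               # sentence representation from the encoder

W_out = rng.normal(size=(T, V, d))   # one projection per target position

def non_autoregressive_decode(z):
    # every position depends on z alone -> one parallel pass over all T slots
    logits = W_out @ z               # shape (T, V)
    return logits.argmax(axis=1)

W_step = rng.normal(size=(V, d + 1))

def autoregressive_decode(z):
    # T sequential steps; each step feeds on the previously emitted token
    tokens, prev = [], 0.0
    for _ in range(T):
        logits = W_step @ np.append(z, prev)
        prev = float(logits.argmax())
        tokens.append(int(prev))
    return np.array(tokens)

assert non_autoregressive_decode(z).shape == (T,)
assert autoregressive_decode(z).shape == (T,)
```

Because the non-autoregressive pass has no step-to-step dependency, its cost does not grow with the sequential depth of decoding, which is the efficiency gain the abstract points to.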
Rethinking Skip-thought: A Neighborhood based Approach
We study the skip-thought model with neighborhood information as weak
supervision. More specifically, we propose a skip-thought neighbor model to
consider the adjacent sentences as a neighborhood. We train our skip-thought
neighbor model on a large corpus with continuous sentences, and then evaluate
the trained model on 7 tasks, which include semantic relatedness, paraphrase
detection, and classification benchmarks. Both quantitative comparison and
qualitative investigation are conducted. We empirically show that our
skip-thought neighbor model performs as well as the skip-thought model on the
evaluation tasks. In addition, we found that incorporating an autoencoder path
did not help our model perform better, while it hurt the performance of the
skip-thought model.
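The weak supervision itself is easy to sketch: in a corpus of continuous sentences, each sentence's training neighborhood is just its adjacent sentences. The symmetric window of one below matches the idea of treating adjacent sentences as neighbors; the helper name and the toy corpus are illustrative.

```python
# Minimal sketch of neighborhood-based weak supervision: for each
# sentence in a continuous corpus, pair it with its immediate neighbors.
def neighbor_pairs(sentences):
    pairs = []
    for i, s in enumerate(sentences):
        for j in (i - 1, i + 1):          # immediate neighbors only
            if 0 <= j < len(sentences):
                pairs.append((s, sentences[j]))
    return pairs

corpus = ["I poured the tea.", "It was too hot.", "So I waited."]
pairs = neighbor_pairs(corpus)
# interior sentences yield two pairs each, boundary sentences one each
```

These pairs play the role that explicit labels play in supervised feature learning: the model is trained to relate each sentence to its neighborhood.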